Amazon CloudFront
Detailed Content
Amazon CloudFront is a fast content delivery network (CDN) service that securely delivers data, videos, applications, and APIs to customers globally with low latency and high transfer speeds. CloudFront integrates with other AWS products to give developers and businesses an easy way to accelerate content to end users with no minimum commitments.
Core Concepts and Features
- Distributions: The core configuration entity in CloudFront. A distribution tells CloudFront where to get your content (origins) and how to deliver it to users (behaviors, caching rules).
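- Web Distribution: Used for HTTP/HTTPS content (websites, web applications, streaming media). This is the only distribution type CloudFront currently offers.
- RTMP Distribution: (Discontinued) Used for streaming media using Adobe Flash Media Server's RTMP protocol. CloudFront ended support for RTMP distributions on December 31, 2020.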
- Edge Locations (Points of Presence - PoPs): Globally distributed data centers where CloudFront caches copies of your content. When a user requests content, CloudFront routes the request to the edge location with the lowest latency, usually the nearest one, and serves cached content from there.
- Origins: The location where CloudFront fetches your content. Origins can be:
- Amazon S3 Bucket: For static website content, images, videos, etc.
- HTTP Server: Any HTTP server, including an EC2 instance, an Elastic Load Balancer, or an on-premises server.
- MediaStore Container: For live and on-demand video streaming.
- Cache Behavior: Defines how CloudFront handles requests for specific URL paths. You can configure caching policies, viewer protocol policies, origin request policies, and more.
- Caching: CloudFront caches content at edge locations to reduce the load on your origins and improve delivery speed. You can control caching behavior using cache policies (TTL, headers, cookies, query strings).
- Viewer Protocol Policy: Determines the protocol (HTTP or HTTPS) that viewers can use to access your content from CloudFront. The options are HTTP and HTTPS, Redirect HTTP to HTTPS, and HTTPS Only.
- Origin Protocol Policy: Determines the protocol (HTTP or HTTPS) that CloudFront uses to fetch content from your origin.
- Signed URLs and Signed Cookies: Provide a way to control access to your private content in CloudFront. You can restrict access to specific users, for a limited time, or from specific IP addresses.
- Field-Level Encryption: Allows you to encrypt specific data fields in an HTTP request at the edge of your network. This ensures that sensitive data is encrypted throughout the entire application stack.
- AWS WAF Integration: CloudFront integrates with AWS WAF (Web Application Firewall) to protect your web applications from common web exploits and bots at the edge.
- Lambda@Edge: Allows you to run Lambda functions at CloudFront edge locations in response to CloudFront events (viewer request, origin request, origin response, viewer response). This enables customization of content delivery, A/B testing, dynamic content generation, and more.
- Origin Access Control (OAC) / Origin Access Identity (OAI): (OAI is legacy, OAC is preferred) A special CloudFront identity that you can associate with your distribution to grant CloudFront permission to fetch private content from an S3 bucket. This prevents users from bypassing CloudFront and accessing your S3 content directly.
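The bucket side of OAC is an S3 bucket policy that lets the CloudFront service principal read objects only on behalf of one distribution. Below is a minimal sketch of such a policy, built in Python. The helper name build_oac_bucket_policy is hypothetical, and the bucket name, account ID, and distribution ID are placeholder assumptions.

```python
import json


def build_oac_bucket_policy(bucket_name, account_id, distribution_id):
    """Return an S3 bucket policy (as a dict) that allows reads only via one distribution."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCloudFrontServicePrincipalReadOnly",
                "Effect": "Allow",
                # OAC requests arrive as the CloudFront service principal.
                "Principal": {"Service": "cloudfront.amazonaws.com"},
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
                # Restrict access to a single distribution (placeholder IDs).
                "Condition": {
                    "StringEquals": {
                        "AWS:SourceArn": (
                            f"arn:aws:cloudfront::{account_id}:distribution/{distribution_id}"
                        )
                    }
                },
            }
        ],
    }


policy = build_oac_bucket_policy("my-static-website-bucket", "111122223333", "E1234567890ABC")
print(json.dumps(policy, indent=2))
```

The resulting JSON string could then be applied with the S3 client's put_bucket_policy call (Bucket and Policy arguments). Because the policy grants no other access, direct requests to the bucket are denied as long as the bucket stays private.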
Use Cases
- Accelerating Static Website Content: Deliver HTML, CSS, JavaScript, images, and other static assets from S3 buckets with low latency to global users.
- Streaming Video and Audio: Deliver live and on-demand video and audio content with high performance and scalability.
- Accelerating Dynamic Content: Improve the performance of dynamic web applications and APIs by routing requests over the AWS global network and caching API responses.
- Securing Web Applications: Integrate with AWS WAF to protect against common web exploits and application-layer request floods at the edge. Use Signed URLs/Cookies to restrict access to premium content.
- Global Application Delivery: Serve applications to a global user base, ensuring a consistent and fast experience regardless of geographic location.
- Customizing Content Delivery: Use Lambda@Edge to modify requests and responses, perform A/B testing, implement custom authentication, or generate dynamic content at the edge.
- Software Downloads: Distribute software updates and large files efficiently and reliably to users worldwide.
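To make the "Customizing Content Delivery" use case concrete, here is a minimal sketch of a Lambda@Edge viewer-request handler that performs a simple header check. The token value, the make_event helper, and the /api/items path are illustrative assumptions; real authentication would typically verify a signed token instead of comparing against a constant.

```python
import hmac

# Hypothetical shared token, hard-coded for illustration only.
EXPECTED_AUTH_VALUE = "Bearer example-token"


def lambda_handler(event, context):
    """Forward requests that carry the expected Authorization header; reject the rest."""
    request = event["Records"][0]["cf"]["request"]
    headers = request.get("headers", {})

    # CloudFront passes header names in lowercase, each mapped to a list of
    # {"key": ..., "value": ...} entries.
    for entry in headers.get("authorization", []):
        if hmac.compare_digest(entry["value"], EXPECTED_AUTH_VALUE):
            return request  # Returning the request lets CloudFront continue processing.

    # Returning a response object short-circuits the request at the edge.
    return {
        "status": "401",
        "statusDescription": "Unauthorized",
        "headers": {
            "www-authenticate": [{"key": "WWW-Authenticate", "value": "Bearer"}],
        },
        "body": "Unauthorized",
    }


def make_event(headers):
    """Build a minimal viewer-request event for local testing (hypothetical helper)."""
    return {
        "Records": [
            {"cf": {"request": {"method": "GET", "uri": "/api/items", "headers": headers}}}
        ]
    }


allowed = lambda_handler(
    make_event({"authorization": [{"key": "Authorization", "value": "Bearer example-token"}]}),
    None,
)
denied = lambda_handler(make_event({}), None)
print(allowed["uri"], denied["status"])
```

The function would be associated with the distribution's Viewer Request event so that it runs before the cache lookup.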
Interview Questions
Conceptual Questions
- What is Amazon CloudFront and what problem does it solve?
- Amazon CloudFront is a fast content delivery network (CDN) service. It solves the problem of delivering content (data, videos, applications, APIs) to users globally with low latency and high transfer speeds by caching content at edge locations closer to users and routing requests over the optimized AWS global network.
- Explain the role of Edge Locations and Origins in CloudFront.
- Edge Locations: Globally distributed data centers where CloudFront caches copies of your content. They serve content to users from the nearest location.
- Origins: The backend location where CloudFront fetches content that is not in its cache. This can be an S3 bucket, an HTTP server (e.g., ALB, EC2), or a MediaStore container.
- What is the purpose of Signed URLs and Signed Cookies in CloudFront? When would you use one over the other?
- Signed URLs and Signed Cookies are used to control access to private content in CloudFront. They allow you to restrict access to specific users, for a limited time, or from specific IP addresses.
- Signed URLs: For individual files or when you want to restrict access to a single file.
- Signed Cookies: For multiple restricted files or when you want to provide access to all files in a private directory.
- How does CloudFront integrate with AWS WAF and Lambda@Edge?
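- AWS WAF: CloudFront integrates with AWS WAF to filter common web exploits and, through rate-based rules, request floods at the edge, before requests reach your origin. Network and transport layer DDoS protection comes from AWS Shield Standard, which covers CloudFront automatically.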
- Lambda@Edge: Allows you to run Lambda functions at CloudFront edge locations in response to CloudFront events. This enables customization of content delivery, dynamic content generation, and custom authentication/authorization logic at the edge.
- Explain the difference between Viewer Protocol Policy and Origin Protocol Policy.
- Viewer Protocol Policy: Controls the protocol (HTTP or HTTPS) that viewers can use to access your content from CloudFront (e.g., Redirect HTTP to HTTPS).
- Origin Protocol Policy: Controls the protocol (HTTP or HTTPS) that CloudFront uses to fetch content from your origin server (e.g., HTTPS Only).
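Returning to the signed URL and signed cookie question above: both mechanisms can carry a custom policy that states which resource is covered, when access expires, and optionally which source IP range is allowed. The sketch below builds such a policy document. The function name build_custom_policy and the example URL and CIDR range are assumptions for illustration, and the policy still has to be signed with your private key before CloudFront will accept it.

```python
import json
import time


def build_custom_policy(resource_url, expires_in_seconds, source_cidr=None, now=None):
    """Build a CloudFront custom policy as a compact JSON string."""
    now = int(time.time()) if now is None else int(now)

    # Expiry is expressed as a Unix timestamp in seconds.
    condition = {"DateLessThan": {"AWS:EpochTime": now + expires_in_seconds}}
    if source_cidr is not None:
        condition["IpAddress"] = {"AWS:SourceIp": source_cidr}

    policy = {"Statement": [{"Resource": resource_url, "Condition": condition}]}

    # The policy is signed with whitespace removed, so serialize it compactly.
    return json.dumps(policy, separators=(",", ":"))


# A wildcard resource covers every file under /private/, which is the typical
# signed-cookie case; a single-file URL is the typical signed-URL case.
policy_json = build_custom_policy(
    "https://d111111abcdef8.cloudfront.net/private/*",
    expires_in_seconds=3600,
    source_cidr="203.0.113.0/24",
    now=1700000000,
)
print(policy_json)
```

In practice botocore's CloudFrontSigner can build an equivalent policy through its build_policy method, so hand-rolling the JSON is mainly useful for understanding what is being signed.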
Scenario-Based Questions
- You are hosting a static website on an S3 bucket, and you want to improve its loading speed for users worldwide while also securing the content. How would you configure CloudFront for this?
- I would create a CloudFront Web Distribution with the S3 bucket as its origin, using the bucket's REST endpoint as the origin domain because Origin Access Control does not work with the S3 website endpoint. I would configure the Viewer Protocol Policy to Redirect HTTP to HTTPS to ensure secure communication. To further secure the S3 content and prevent direct access, I would create an Origin Access Control (OAC) and configure the S3 bucket policy to only allow access from the distribution that uses this OAC. CloudFront would then cache the static content at its edge locations, providing low-latency access to users globally.
- Your application serves premium video content, and you need to ensure that only authorized, paying subscribers can access the videos for a limited time. How would you implement this access control using CloudFront?
- I would use CloudFront Signed URLs or Signed Cookies. For individual video files, I would generate a Signed URL for each video, embedding an expiration time and potentially IP address restrictions. For a collection of videos or a directory, Signed Cookies would be more suitable, allowing a user to access multiple restricted files after authentication. The application would handle user authentication and then generate the appropriate signed URL or set the signed cookie.
- You have an API endpoint that needs to perform custom authentication logic before forwarding requests to your backend. This authentication logic is implemented as a Lambda function. How can you integrate this with CloudFront to execute the Lambda function at the edge?
- I would use Lambda@Edge. I would create a Lambda function containing my custom authentication logic. Then, I would associate this Lambda function with my CloudFront distribution's Viewer Request event. When a user makes a request to CloudFront, the Lambda function would execute at the nearest edge location, perform the authentication, and either allow the request to proceed to the origin or return an unauthorized response directly from the edge.
- Your web application is experiencing a high volume of malicious requests, including SQL injection attempts and cross-site scripting (XSS) attacks. How can you protect your application at the edge using CloudFront?
- I would integrate AWS WAF (Web Application Firewall) with my CloudFront distribution. I would create a Web ACL in WAF with the CloudFront scope and associate it with the distribution. Within the Web ACL, I would add AWS Managed Rule Groups (e.g., AWSManagedRulesCommonRuleSet, which includes cross-site scripting rules, and AWSManagedRulesSQLiRuleSet) to detect and block common web exploits. I could also add custom rules to block specific IP addresses or patterns of malicious traffic.
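For the WAF scenario above, the managed rule groups are expressed as rule entries inside the Web ACL definition. The following is a minimal sketch of how those entries could be assembled in Python. The helper name managed_rule is hypothetical, and the metric names simply reuse the rule group names as an assumption.

```python
def managed_rule(name, priority):
    """Build one Web ACL rule entry that references an AWS managed rule group."""
    return {
        "Name": name,
        "Priority": priority,  # Lower numbers are evaluated first.
        "Statement": {
            "ManagedRuleGroupStatement": {"VendorName": "AWS", "Name": name}
        },
        # Keep the actions defined inside the rule group (no override).
        "OverrideAction": {"None": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": name,
        },
    }


rule_group_names = ["AWSManagedRulesCommonRuleSet", "AWSManagedRulesSQLiRuleSet"]
rules = [managed_rule(name, priority) for priority, name in enumerate(rule_group_names)]

for rule in rules:
    print(rule["Priority"], rule["Name"])
```

A list like this would be passed as the Rules argument when creating the Web ACL through the wafv2 API with Scope set to CLOUDFRONT; Web ACLs for CloudFront are created in the us-east-1 Region.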
Coding/CLI Examples
Here are some common CloudFront operations using the AWS CLI and Python (Boto3).
AWS CLI Examples
- Create a CloudFront Web Distribution with an S3 origin:
```bash
# Assumes the S3 bucket 'my-static-website-bucket' exists and that an OAC has been
# created and allowed in the bucket policy. OAC works with the bucket's REST endpoint,
# not the S3 static website endpoint.
# ForwardedValues and the TTL fields are legacy cache settings; a cache policy
# (CachePolicyId) is the currently recommended alternative.
OAC_ID="your-oac-id"  # Replace with your OAC ID

# Write the config to a file so the shell can expand $(date +%s) and ${OAC_ID}.
cat > distribution-config.json <<EOF
{
  "CallerReference": "$(date +%s)",
  "Comment": "My CloudFront Distribution for S3",
  "Enabled": true,
  "Origins": {
    "Quantity": 1,
    "Items": [
      {
        "Id": "MyS3Origin",
        "DomainName": "my-static-website-bucket.s3.amazonaws.com",
        "S3OriginConfig": {
          "OriginAccessIdentity": ""
        },
        "OriginAccessControlId": "${OAC_ID}"
      }
    ]
  },
  "DefaultCacheBehavior": {
    "TargetOriginId": "MyS3Origin",
    "ViewerProtocolPolicy": "redirect-to-https",
    "AllowedMethods": {
      "Quantity": 2,
      "Items": ["GET", "HEAD"],
      "CachedMethods": {
        "Quantity": 2,
        "Items": ["GET", "HEAD"]
      }
    },
    "ForwardedValues": {
      "QueryString": false,
      "Cookies": {"Forward": "none"}
    },
    "MinTTL": 0,
    "DefaultTTL": 86400,
    "MaxTTL": 31536000
  },
  "ViewerCertificate": {
    "CloudFrontDefaultCertificate": true
  }
}
EOF

aws cloudfront create-distribution \
  --distribution-config file://distribution-config.json
```
- Invalidate cached objects in a CloudFront distribution:
```bash
DISTRIBUTION_ID="E1234567890ABC"  # Replace with your Distribution ID

aws cloudfront create-invalidation \
  --distribution-id "$DISTRIBUTION_ID" \
  --paths "/images/*" "/index.html"
```
- Create a Signed URL for a private S3 object via CloudFront: This requires an RSA key pair whose public key is registered with CloudFront, either in a trusted key group (recommended) or as a legacy CloudFront key pair created by the account root user. Store the private key file (pk-xxxxxxxxxxxx.pem) securely and note the key pair ID (APKAxxxxxxxxxxxx). The AWS CLI provides the "aws cloudfront sign" command, but applications typically generate signed URLs with an SDK. Example Python script (requires the boto3 and cryptography libraries):
```python
import datetime

from botocore.signers import CloudFrontSigner
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import padding


def rsa_signer(message):
    """Sign the policy with the private key; CloudFront signed URLs use RSA with SHA-1."""
    with open("pk-APKAxxxxxxxxxxxx.pem", "rb") as key_file:
        private_key = serialization.load_pem_private_key(
            key_file.read(),
            password=None,
            backend=default_backend(),
        )
    return private_key.sign(message, padding.PKCS1v15(), hashes.SHA1())


key_id = "APKAxxxxxxxxxxxx"  # Your CloudFront Key Pair ID
url = "https://d111111abcdef8.cloudfront.net/private/document.pdf"  # Your CloudFront URL
# Expire one hour from now; a fixed date in the past would yield an already-expired URL.
expire_date = datetime.datetime.now(datetime.timezone.utc) + datetime.timedelta(hours=1)

cf_signer = CloudFrontSigner(key_id, rsa_signer)
signed_url = cf_signer.generate_presigned_url(url, date_less_than=expire_date)
print(signed_url)
```
Python (Boto3) Examples
First, ensure you have Boto3 installed (pip install boto3) and your AWS credentials configured.
- Create a CloudFront Web Distribution with an S3 origin:
```python
import time

import boto3

cf_client = boto3.client('cloudfront')

s3_bucket_name = "my-boto3-static-website-bucket-12345"  # REPLACE with your S3 bucket name
oac_id = "your-oac-id"  # REPLACE with your Origin Access Control ID

try:
    response = cf_client.create_distribution(
        DistributionConfig={
            'CallerReference': str(int(time.time())),
            'Comment': 'My Boto3 CloudFront Distribution for S3',
            'Enabled': True,
            'Origins': {
                'Quantity': 1,
                'Items': [
                    {
                        'Id': 'MyS3Origin',
                        'DomainName': f"{s3_bucket_name}.s3.amazonaws.com",
                        'OriginAccessControlId': oac_id,
                        'S3OriginConfig': {
                            'OriginAccessIdentity': ''  # Use OAC instead of OAI
                        }
                    },
                ]
            },
            'DefaultCacheBehavior': {
                'TargetOriginId': 'MyS3Origin',
                'ViewerProtocolPolicy': 'redirect-to-https',
                'AllowedMethods': {
                    'Quantity': 2,
                    'Items': ['GET', 'HEAD'],
                    'CachedMethods': {
                        'Quantity': 2,
                        'Items': ['GET', 'HEAD']
                    }
                },
                'ForwardedValues': {
                    'QueryString': False,
                    'Cookies': {'Forward': 'none'}
                },
                'MinTTL': 0,
                'DefaultTTL': 86400,
                'MaxTTL': 31536000
            },
            'ViewerCertificate': {
                'CloudFrontDefaultCertificate': True
            }
        }
    )
    distribution_id = response['Distribution']['Id']
    print(f"Created CloudFront Distribution: {distribution_id}")
except Exception as e:
    print(f"Error creating distribution: {e}")
```
- Create an Invalidation for a CloudFront distribution:
```python
import time

import boto3

cf_client = boto3.client('cloudfront')

distribution_id = "E1234567890ABC"  # REPLACE with your Distribution ID
paths_to_invalidate = ["/images/*", "/index.html"]

try:
    response = cf_client.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch={
            'Paths': {
                'Quantity': len(paths_to_invalidate),
                'Items': paths_to_invalidate
            },
            'CallerReference': str(int(time.time()))
        }
    )
    invalidation_id = response['Invalidation']['Id']
    print(f"Created invalidation {invalidation_id} for distribution {distribution_id}.")
except Exception as e:
    print(f"Error creating invalidation: {e}")
```
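A CallerReference derived from the current second can collide when two invalidations are submitted in quick succession. One possible refinement, sketched below, builds the InvalidationBatch with a UUID-based reference and removes duplicate paths. The helper name build_invalidation_batch is hypothetical and not part of Boto3.

```python
import uuid


def build_invalidation_batch(paths):
    """Return an InvalidationBatch dict with de-duplicated paths and a unique reference."""
    # dict.fromkeys keeps the first occurrence of each path and preserves order.
    unique_paths = list(dict.fromkeys(paths))
    return {
        "Paths": {"Quantity": len(unique_paths), "Items": unique_paths},
        # A random UUID avoids collisions between requests made in the same second.
        "CallerReference": str(uuid.uuid4()),
    }


batch = build_invalidation_batch(["/images/*", "/index.html", "/index.html"])
print(batch["Paths"])
```

The returned dict could be passed as the InvalidationBatch argument of create_invalidation in place of the inline dictionary.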